Knowledge Discovery on the Grid

Authors

  • Lamine Aouad
  • Tahar Kechadi
Abstract

In the last few decades, Grid technologies have emerged as an important area in parallel and distributed computing. The Grid can be seen as large-scale computational support and, in some cases, as high-performance support. In recent years, the data mining community has increasingly used Grid facilities to store, share, manage, and mine the data of large-scale data-driven applications. Indeed, data mining and knowledge discovery applications are distributed by nature and use the Grid as their execution environment. This has led to strong community interest in distributed data mining and knowledge discovery on large Grid platforms. Many Grid-based Data Mining (DM) and Knowledge Discovery (KD) frameworks have been initiated, proposing different techniques and solutions for mining large-scale datasets. These include the ADMIRE project initiated by the PCRG (Parallel Computational Research Group) at University College Dublin, the Knowledge Grid project at the University of Calabria, and the GridMiner project at the University of Vienna, among others. These knowledge discovery frameworks on the Grid aim to offer high-level abstractions and techniques for distributed management, mining, and knowledge extraction from data repositories and warehouses. Most of them use existing Grid technologies and systems to build specific knowledge discovery services and data management, analysis, and mining techniques. This consists of either porting existing algorithms and applications to the Grid, or developing new mining and knowledge extraction techniques that exploit the Grid's features and services. Grid infrastructures usually provide basic services such as communication, authentication, storage and computing resources, and data placement and management. For example, the Knowledge Grid system uses services provided by the Globus Toolkit, and the ADMIRE framework uses a Grid system called DGET, developed by our team at University College Dublin.
We will give some details about the best-known DM/KD frameworks in section 2. Note that this chapter is not devoted to Grid systems or the way they are interfaced with knowledge discovery frameworks. Indeed, beyond the architecture design of Grid systems, the resources and data management policies, the data integration or placement techniques, and so on, these DM and KD frameworks need


Similar resources

Weighted-HR: An Improved Hierarchical Grid Resource Discovery

Grid computing environments include heterogeneous resources shared by a large number of computers to handle data- and process-intensive applications. In these environments, the required resources must be accessible to Grid applications on demand, which makes resource discovery a critical service. In recent years, various techniques have been proposed to index and discover the Grid resource...


Designing an Ontology for Knowledge Discovery in Iran’s Vaccine

Ontology is a requirement engineering product and the key to knowledge discovery. It includes the terminology to describe a set of facts, assumptions, and relations with which the detailed meanings of vocabularies among communities can be determined. This is a qualitative content analysis research. This study has made use of ontology for the first time to discover the knowledge of vaccine in Ir...


Knowledge Discovery Services and Tools on Grids

The Grid is today mainly used for supporting high-performance, computing-intensive applications. However, it could be effectively exploited for deploying data-driven and knowledge discovery applications. To support this class of applications, tools and services for knowledge discovery are vital. The Knowledge Grid is a high-level system for providing Grid-based knowledge discovery services. The...


Designing Grid services for distributed knowledge discovery

The increasing use of computers in all areas of human activity is resulting in huge collections of digital data. Databases are common everywhere and are used as repositories for every kind of data. Knowledge discovery techniques and tools are used today to analyze those very large data sets and identify interesting patterns and trends in them. When data is maintained over geographically dis...


KNOWLEDGE GRID: High Performance Knowledge Discovery Services on the Grid

Knowledge discovery tools and techniques are used in an increasing number of scientific and commercial areas for the analysis of large data sets. When large data repositories are coupled with geographic distribution of data, users and systems, it is necessary to combine different technologies for implementing high-performance distributed knowledge discovery systems. On the other hand, computati...




Publication date: 2009